Abstract

Neuropeptides have been discovered in many arthropod species including crustaceans. The nature of their biological 
function is well studied and varies from behavior modulation to physiological regulation of complex biochemical processes 
such as metabolism, molt and reproduction. Due to their key role in these fundamental processes, neuropeptides are often 
targeted for modulating these processes to align with market demands in commercially important species. We generated a 
comprehensive transcriptome of the eyestalk and brain of one of the few commercially important spiny lobster species in
the Southern Hemisphere, the Eastern rock lobster Sagmariasus verreauxi, and mined it for novel neuropeptide and protein
hormone-encoding transcripts. We then characterized the predicted mature hormones to verify their validity based on 
conserved motifs and features known from previously reported hormones. Overall, 37 transcripts which are predicted to 
encode mature full-length/partial peptides/proteins were identified, representing 21 peptide/protein families/subfamilies. 
All transcripts had high similarity to hormones that were previously characterized in other decapod crustacean species or, 
where absent in crustaceans, in other arthropod species. These included, in addition to other proteins previously described 
in crustaceans, prohormone-3 and prohormone-4 which were previously identified only in insects. A homolog of the 
crustacean female sex hormone (CFSH), recently found to be female-specific in brachyuran crabs, was found to have the
same levels of expression in both male and female eyestalks, suggesting that the CFSH female specificity is not conserved 
throughout decapod crustaceans. Digital gene expression showed that 24 out of the 37 transcripts presented in this study 
have significant changes in expression between eyestalk and brain. In some cases a trend of difference between males and 
females could be seen. Taken together, this study provides a comprehensive neuropeptidome of a commercially important 
crustacean species with novel peptides and protein hormones identified for the first time in decapods. 
Introduction

The Eastern rock lobster Sagmariasus verreauxi is one of a few 
closely related species which constitute the spiny lobster fishery 
industry in the Southern Hemisphere [1]. Identifying the 
molecular components which govern fundamental processes in 
this species might thus prove useful in further enhancing the 
aquaculture industry of this taxonomic group. Neuropeptides and 
protein hormones have long been suggested as targets for 
crustacean aquaculture enhancement [2,3]. They govern a wide 
array of physiological and behavioral processes and have been 
studied extensively in crustaceans [4]. Neuropeptides are translated
as larger precursors (usually known as prepro-peptides) which
include a signal peptide at their N-terminus. The signal peptide
directs translation of the prepro-peptide into the rough endoplasmic
reticulum, where the signal peptide is cleaved off, leaving the
pro-peptide, which is then further processed prior to the secretion
of the mature peptide [4].
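The processing steps described above can be illustrated with a minimal Python sketch. It is not the prediction pipeline used in this study. It assumes, for illustration only, that any run of two or more lysine/arginine residues is a cleavage site and that a C-terminal glycine is converted to an amide (written 'a', as in Table 2). The function name `process_propeptide` is hypothetical, and the input is the pro-peptide portion of the partial corazonin precursor shown in Fig. 5, with the signal peptide already removed.

```python
import re
from typing import List


def process_propeptide(propeptide: str) -> List[str]:
    """Split a pro-peptide at basic-residue runs and amidate C-terminal glycines."""
    # Assumption: every run of two or more K/R residues is a cleavage site.
    fragments = [f for f in re.split(r"[KR]{2,}", propeptide) if f]
    mature = []
    for fragment in fragments:
        # Assumption: a C-terminal glycine donates the amide group ('a').
        if len(fragment) > 1 and fragment.endswith("G"):
            fragment = fragment[:-1] + "a"
        mature.append(fragment)
    return mature


# Pro-peptide portion of the partial corazonin precursor (Fig. 5).
corazonin_pro = "QTFQYSRGWTNGRKRSDPSVGVRRV"
print(process_propeptide(corazonin_pro))
```

Real prohormone convertases are more selective than this rule, so the sketch over-predicts cleavage in general; dedicated prediction tools and comparison with known peptides remain necessary.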



The list of putative neuropeptide sequences from different 
crustacean species has considerably increased over the past few 
years with the employment of bioinformatic mining in publicly 
available databases [5], de novo transcriptome assemblies [6-9] and
mass spectrometry [10-13]. With the expansion of the crustacean
neurohormone database, identification of the conserved features of
the mature neurohormones further enables mining of novel
neurohormones through de novo transcriptomes of crustacean species
where neurohormones were not previously identified. Comparisons
with other arthropod species where neuropeptidomes have
been characterized [14-21] enable insights into species' life history,
as in the case of the parasitic wasp Nasonia vitripennis [14] and the
social honeybee Apis mellifera [15], and evolution, as in the case of
the fruit fly Drosophila sp. [19] and the silk moth Bombyx mori [20].

With the recent rapid advancement in transcriptome sequencing
capabilities, it becomes increasingly affordable to establish
comprehensive transcriptomes of non-model organisms. We
collected RNA from several key tissues that are known to be the
primary sites of neuropeptide production and secretion in
crustaceans and generated a comprehensive transcriptome of S.
verreauxi. These tissues included the eyestalk, where the X-organ-sinus
gland (XOSG) neuroendocrine complex resides, the thoracic
ganglia and brain. From the transcriptomic data obtained, we
compiled a list of the putative neuropeptides and protein
hormones and characterized them via comparisons to previously
reported neuropeptides to predict the processing of prepro-peptides
into mature neuropeptides. The conserved motifs were
identified and highlighted, providing a database that might prove
useful for further identification of neuropeptides in closely related
species.

Figure 1. Type A allatostatin precursors predicted partial ORFs and conserved motif. A) N-terminus ORF (derived from Unigene56418_AII)
with signal peptide (highlighted in red), 10 predicted allatostatin peptides (highlighted in green), and amidated glycine (highlighted in light blue),
separated by carboxyl-peptidase cleavage sites (underlined). B) Middle part ORF (derived from Unigene36127_AII) with 8 predicted allatostatin
peptides (highlighted in green), separated by carboxyl-peptidase cleavage sites (underlined). C) C-terminus ORF (derived from Unigene45628_AII)
with 4 predicted carcinustatin peptides (highlighted in green) and amidated glycine (highlighted in light blue), separated by carboxyl-peptidase
cleavage sites (underlined). Asterisk indicates the stop codon. D) Type A allatostatin peptides conservation: 22 predicted neuropeptides of 8-10 aa in
length derived from 3 putative partial transcripts with XXXXYXFGLamide conserved.
doi:10.1371/journal.pone.0097323.g001

Results 

Allatostatins 

Three transcripts were identified to putatively encode partial
type A allatostatin precursors representing the N-terminus,
middle region and C-terminus, with 248, 154 and 93 amino acids
(aa), respectively (Table 1 and Fig. 1). The precursor N-terminus
has a predicted signal peptide of 27 aa, followed by 10 predicted
neuropeptides, separated by dibasic proteinase cleavage sites
(Fig. 1A), while the middle region and C-terminus contain 8 and 4
predicted neuropeptides (respectively), also separated by dibasic
proteinase cleavage sites (Fig. 1B, C). The 22 predicted
neuropeptides are 8 residues in length with a highly conserved
YXFGLamide motif at the C-terminus of each peptide (Fig. 1D). When
the mature neuropeptides were submitted individually to BLAST, they
showed either high similarity or, for most, exact identity to
other type A allatostatins, primarily from decapod crustacean
species, apart from two that were most similar to those of insect species.



Most of the Eastern rock lobster putative type A allatostatin 
neuropeptides (17/22) had highest homology to type A allatostatin 
of the spiny lobster Panulirus interruptus (Table 2). All three type A
allatostatin-encoding transcripts were found to have comparable 
expression levels with significantly higher expression in the brain, 
compared to the eyestalk (Table 3). 
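As an illustration of how the conserved C-terminal motif can be screened for, the following Python sketch tests a few of the type A peptides listed in Table 2 against a regular expression. The pattern `Y.FGLa$` is an assumed encoding of YXFGLamide (with 'a' marking amidation), and the helper name `has_fgl_amide_motif` is hypothetical.

```python
import re

# Assumption: YXFGLamide is modeled as Y, any residue, FGL, then the amide mark 'a'.
MOTIF = re.compile(r"Y.FGLa$")


def has_fgl_amide_motif(peptide: str) -> bool:
    """Return True if the peptide ends with the assumed YXFGLamide pattern."""
    return MOTIF.search(peptide) is not None


# A few predicted peptides taken from Table 2.
peptides = ["HNNYAFGLa", "TPDYAFGLa", "EGMYSFGLa", "ADLFSFGLa", "EDSPASDAYTL"]
for peptide in peptides:
    print(peptide, has_fgl_amide_motif(peptide))
```

With this strict pattern, ADLFSFGLa (F in place of Y) and EDSPASDAYTL (no FGLamide terminus) are not matched, which shows why the motif is described as highly conserved rather than invariant.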

Two transcripts were identified to putatively encode partial 
type B allatostatin precursors representing the N-terminus and 
C-terminus, with 152 and 135 aa, respectively (Table 1 and Fig. 2). 
The N-terminus has a predicted signal peptide of 33 aa, followed 
by 8 predicted neuropeptides, separated by dibasic proteinase 
cleavage sites (Fig. 2 A), while in the C-terminus there are 5 
predicted neuropeptides, separated by dibasic proteinase cleavage 
sites (Fig. 2B). The 13 predicted neuropeptides are 9-14 aa in
length with a conserved XXDWXXXXXXGXWamide motif
(Fig. 2C). BLAST matched 7 of the above 13 neuropeptides to
type B allatostatins of the caridean shrimp Pandalopsis japonica, while
the other 6 appear to be novel (Table 2). Both transcripts were
found to have comparable expression levels with significantly 
higher expression of the N-terminus in the eyestalk, compared to 
the brain (Table 3). 
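For short peptides of equal length, the similarity to a best hit can be expressed as a simple position-by-position identity. The Python sketch below does this for two type B peptide pairs from Table 2. It is a simplification that assumes no gaps and is not the BLAST scoring itself; the function name `positional_identity` is hypothetical.

```python
def positional_identity(query: str, hit: str) -> float:
    """Percent identical residues between two ungapped peptides of equal length."""
    # The lowercase 'a' marks C-terminal amidation and is not a residue.
    query, hit = query.rstrip("a"), hit.rstrip("a")
    if len(query) != len(hit):
        raise ValueError("peptides must have the same length")
    matches = sum(q == h for q, h in zip(query, hit))
    return 100.0 * matches / len(query)


# (S. verreauxi peptide, best BLAST hit) pairs taken from Table 2.
pairs = [("TDWSSMHGTWa", "ADWSSMRGTWa"), ("GNWDKFHGSWa", "ANWNKFQGSWa")]
for query, hit in pairs:
    print(query, hit, positional_identity(query, hit))
```

Peptides of unequal length, such as DYDGVYPDKR and its gapped hit in Table 2, would need a proper alignment instead.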

One transcript was identified to putatively encode a complete 
type C allatostatin precursor with 141 aa, starting with a signal 
peptide of 22 aa, followed by 3 putative neuropeptides, separated 
by dibasic proteinase cleavage sites (Fig. 3A). The predicted 
neuropeptides are 14-15 aa in length with no homology between 
them. The peptide at the precursor C-terminus has two cysteine 
residues characteristic of other allatostatins (Fig. 3A). Two of the 
three neuropeptides shared high identity with type C allatostatin 
Table 2. Alphabetical list of peptides and their best BLAST hit.

Hormone | Best BLAST hit | Accession number | Identity
--- | --- | --- | ---
Allatostatin A | | |
HNNYAFGLa | allatostatin precursor protein [Panulirus interruptus] | BAF64528 | 100% identity
TPDYAFGLa | allatostatin precursor protein [Panulirus interruptus] | BAF64528 | 100% identity
EGMYSFGLa | allatostatin precursor protein [Panulirus interruptus] | BAF64528 | DGMYSFGLa
ADLFSFGLa | allatostatin precursor protein [Panulirus interruptus] | BAF64528 | 100% identity
SGNYNFGLa | allatostatin precursor protein [Panulirus interruptus] | BAF64528 | 100% identity
SQYAFGLa | A-type allatostatin [Amphibalanus amphitrite] | AFK81929 | 100% identity
SKLYSFGLa | FGLa-related allatostatin [Nilaparvata lugens] | BAO00953 | QKLYSFGLa
NRQYSFGLa | allatostatin precursor protein [Panulirus interruptus] | BAF64528 | 100% identity
SQQYAFGLa | type-a prepro-allatostatin [Macrobrachium nipponense] | AEX86939 | 100% identity
PRNYAFGLa | allatostatin precursor protein [Panulirus interruptus] | BAF64528 | 100% identity
PTAYSFGLa | allatostatin precursor protein [Panulirus interruptus] | BAF64528 | PTTYSFGLa
TASYGFGLa | allatostatin precursor protein [Panulirus interruptus] | BAF64528 | 100% identity
SDLYDNDLGRSYDFGL | allatostatin precursor protein [Panulirus interruptus] | BAF64528 | SDSYDNGLGRRSYDFGL
SGPYAFGLa | allatostatin precursor protein [Panulirus interruptus] | BAF64528 | 100% identity
GGPYAFGLa | type-a pre-proallatostatin [Macrobrachium rosenbergii] | AAY82901 | 100% identity
ADLYSFGLa | allatostatin precursor protein [Panulirus interruptus] | BAF64528 | 100% identity
ADPYSFGLa | allatostatin precursor protein [Panulirus interruptus] | BAF64528 | 100% identity
AGQYSFGLa | allatostatin precursor protein [Panulirus interruptus] | BAF64528 | 100% identity
AGPYSFGLa | allatostatin precursor protein [Panulirus interruptus] | BAF64528 | 100% identity
EDSPASDAYTL | allatostatin precursor protein [Panulirus interruptus] | BAF64528 | EDSSASDPYIL
SGSYSFGLa | type-a prepro-allatostatin [Macrobrachium nipponense] | AEX86939 | 100% identity
AGPYSFGLa | allatostatin precursor protein [Panulirus interruptus] | BAF64528 | 100% identity
Allatostatin B | | |
TDWSSMHGTWa | B-type preproallatostatin II [Pandalopsis japonica] | AFV91539 | ADWSSMRGTWa
PDLLQAPLQAVGD | Na | |
GNWDKFHGSWa | B-type preproallatostatin II [Pandalopsis japonica] | AFV91539 | ANWNKFQGSWa
AEEIQAAED | Na | |
ADWNKFHGSWa | Na | |
GDEFASPELETTED | Na | |
ANWNKFHGSWa | B-type preproallatostatin II [Pandalopsis japonica] | AFV91539 | ANWNKFQGSWa
GDDLVDAEL | Na | |
DWSSLQGTWa | B-type preproallatostatin 1, partial [Pandalopsis japonica] | AFV91539 | GWSSLQGSWa
DWNNLHGAWa | B-type preproallatostatin 1, partial [Pandalopsis japonica] | AFV91539 | AWKNLHGAWa
SPDWNSLRGAWa | B-type preproallatostatin 1, partial [Pandalopsis japonica] | AFV91539 | SGDWNSLRGAWa
APDWAQFRGSWa | B-type preproallatostatin 1, partial [Pandalopsis japonica] | AFV91539 | DGDWSQFRGSWa
VPDEVNETAAHQA | Na | |
Allatostatin C | | |
ALGEEQLQEEAAKS | Na | |
MFAPLSGLPGELPTI | C-type preproallatostatin [Pandalopsis japonica] | AFV91540 | LFAPLSGLPGEIPTM
QIRYHQCYFNPISCF | C-type preproallatostatin [Pandalopsis japonica] | AFV91540 | QIRYRQCYFNPISCF
Hormone-1 | | |
SYWKQCAFNAVSCFa | prohormone-1 isoform X2 [Apis mellifera] | XP_006570429 | 100% identity
Bursicon alpha subunit | bursicon [Procambarus clarkii] | ADY80040 | 90% identity
Corazonin | | |
TFQYSRGWTNa | Pro-corazonin [Harpegnathos saltator] | EFN88292 | 100% identity
Crustacean cardioactive peptide | crustacean cardioactive peptide [Homarus gammarus] | ABB46292 | 81% identity in 75% cover
Crustacean female sex hormone | crustacean female sex hormonoe, partial [Carcinus maenas] | AEI72264 | 26% identity
Crustacean hyperglycemic hormone (CHH) isoform B1 | prepro-crustacean hyperglycemic hormone isoform B [Nephrops norvegicus] | AAQ22392 | 82% identity
Table 2. Cont.

Hormone | Best BLAST hit | Accession number | Identity
--- | --- | --- | ---
CHH isoform B2 | crustacean hyperglycemic hormone isoform 2 [Rimicaris kairei] | ACS35347 | 81% identity
CHH isoform B3 | CHH-like protein precursor [Procambarus clarkii] | AF474408 | 64% identity
CHH isoform B4 | prepro-crustacean hyperglycemic hormone isoform B [Nephrops norvegicus] | AAQ22392 | 85% identity
CHH unspecified | hyperglycemic hormone [Pandalopsis japonica] | AFG16932 | 59% identity
Molt inhibiting hormone (MIH) isoform 1 | Molt-inhibiting hormone [Orconectes limosus] | P83636 | 55% identity
MIH isoform 2 | Probable molt-inhibiting hormone [Jasus lalandii] | P83220 | 70% identity
MIH isoform 3 | Vitellogenesis inhibiting hormone [Homarus gammarus] | ABA42181 | 72% identity
Diuretic hormone | prepro-calcitonin-like diuretic hormone [Homarus americanus] | ACX46386 | 90% identity
Eclosion hormone isoform 1 | eclosion hormone 2 [Nilaparvata lugens] | BAO00951 | 62% identity
Eclosion hormone isoform 2 | eclosion hormone 1 [Nilaparvata lugens] | BAO00950 | 49% identity
FLP (myosupressin) | myosuppressin-like neuropeptide precursor [Procambarus clarkii] | BAG68789 | 86% identity
Follistatin isoform 1 | follistatin-like, partial [Nematostella vectensis] | ABF61774 | 54% identity
Follistatin isoform 2 | follistatin-related protein 1 isoform 1 [Odobenus rosmarus divergens] | XP_004403583 | 38% identity
Myostatin | MSTN [Penaeus monodon] | AD034177 | 65% identity
Neuropeptide Y | neuropeptide Y [Lymnaea stagnalis] | CAB63265 | 57% identity
Neuroparsin isoform 1 | neuroparsin [Jasus lalandii] | AHG98659 | 97% identity
Neuroparsin isoform 2 | neuroparsin [Jasus lalandii] | AHG98659 | 48% identity
Orcokinin | | |
FDAFTTGFGHSKR | Orcokinin [Procambarus clarkii] | Q9NL83 | 100% identity
NFDEIDRSGFAFAKK | Orcokinin [Procambarus clarkii] | Q9NL83 | NFDEIDRSGFGFAKK
NFDEIDRAGLGFAKR | prepro-orcokinin II [Homarus americanus] | ACD13197 | NFDEIDRSGFGFNKR
NFDEIDRSGFGFNKR | prepro-orcokinin II [Homarus americanus] | ACD13197 | 100% identity
NFDEIDRAGLGFHKR | prepro-orcokinin II [Homarus americanus] | ACD13197 | NFDEIDRSGFGFHKR
NFDEIDRSGFGFNKR | prepro-orcokinin II [Homarus americanus] | ACD13197 | 100% identity
NFDEIDRTGFGFHKR | Orcokinin [Procambarus clarkii] | Q9NL83 | 100% identity
DYDGVYPDKR | prepro-orcokinin II [Homarus americanus] | ACD13197 | DYD-VYPEKR
NFDEIDRAGFGFVKR | prepro-orcokinin II [Homarus americanus] | ACD13197 | NFDEIDRSGFGFVKR
AFGPRDISNLYKR | prepro-orcokinin II [Homarus americanus] | ACD13197 | VYGPRDIANLYKR
NFDEIDRSGFGFVRR | prepro-orcokinin II [Homarus americanus] | ACD13197 | 100% identity
Pigment dispersing hormone | | |
NAELINSILGLPKVMNDAa | Pigment-dispersing hormone [Uca pugilator] | P08871 | NSELINSILGLPKVMNDAa
NAELINSLLGIPKVMSDAa | Pigment-dispersing hormone [Litopenaeus vannamei] | P91963 | NSELINSLLGIPKVMNDAa
Hormone-3 | prohormone-3 [Apis mellifera] | XP_001122204 | 43% identity
Hormone-4 | prohormone-4-like [Acyrthosiphon pisum] | XP_001951503 | 89% identity
Red pigment concentrating hormone | Red pigment-concentrating prohormone [Callinectes sapidus] | Q23757 | 63% identity
Sulfakinin | | |
EFDEYGHMRFa | preprosulfakinin [Homarus americanus] | ABQ95346 | 100% identity
SGGEYDDYGHLRFa | preprosulfakinin [Homarus americanus] | ABQ95346 | GGGEYDDYGHLRFa
Tachykinin | | |
APSGFLGMRa | preprotachykinin [Procambarus clarkii] | BAC82426 | 100% identity

Best BLAST hit shows arthropods that are not decapod crustaceans (underlined) and non-arthropods (italicized and underlined). Identity of proteins is given as a
percentage and that of peptides as the hit sequence with non-identical aa underlined (amidation is noted by 'a').
doi:10.1371/journal.pone.0097323.t002
identified in P. japonica (Table 2). The transcript level was found to
be significantly higher in the brain compared to the eyestalk
(Table 3). Another transcript was identified to putatively encode a
complete prohormone-1 with 105 aa, starting with a signal peptide
of 25 aa, followed by 1 putative neuropeptide flanked by dibasic
proteinase cleavage sites (Fig. 3B). The putative neuropeptide in
prohormone-1 shares a conserved motif (QCXFNXXSCF) with
the last putative peptide in the type C allatostatin (Fig. 3C), and is
identical to the neuropeptide encoded by prohormone-1 of insects
(Table 2). While, like allatostatin type C, prohormone-1 has
significantly higher expression in the brain compared with the
eyestalk, the overall expression of prohormone-1 is one order of
magnitude higher than that of all the allatostatins (Table 3).
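The shared motif can be derived directly from the two aligned peptides of Fig. 3C. The Python sketch below keeps identical columns and writes X elsewhere. The function name `consensus_motif` is hypothetical, and trimming the flanking wildcards is a presentation choice of this sketch rather than part of the alignment.

```python
def consensus_motif(seq_a: str, seq_b: str, wildcard: str = "X") -> str:
    """Column-wise consensus of two aligned sequences; mismatches and gaps become X."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [a if a == b and a != "-" else wildcard for a, b in zip(seq_a, seq_b)]
    # Trim non-conserved flanks so only the core motif remains.
    return "".join(columns).strip(wildcard)


# Aligned peptides as shown in Fig. 3C ('-' marks an alignment gap).
type_c = "QIRYHQCYFNPISCF-"
prohormone_1 = "-SYWKQCAFNAVSCFG"
print(consensus_motif(type_c, prohormone_1))  # QCXFNXXSCF
```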

Bursicon alpha subunit 

One transcript was identified to putatively encode a complete
bursicon alpha subunit precursor with 142 aa, starting with a 25 aa
signal peptide, followed by a predicted C-terminal cysteine knot-like
domain of 89 aa which contains ten conserved cysteine
residues (Table 1 and Fig. 4). The mature hormone shares up to
90% identity with bursicon alpha subunits identified in other
decapod crustacean species (Table 2). The level of expression is
very low in the brain and not evident in the eyestalk (Table 3).

Corazonin 

One transcript was identified to putatively encode 49 aa of the
N-terminus of the corazonin precursor, starting with a 24 aa long
signal peptide followed by an 11 aa conserved peptide (identical to
corazonin peptides of insects; Table 2) followed by a carboxyl-peptidase
cleavage site (Table 1 and Fig. 5). Corazonin expression
was found to be almost exclusive to the eyestalk, with slightly higher
levels in females (Table 3).
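The conceptual translation behind such ORF predictions can be sketched in Python as follows. The sketch assumes the standard genetic code and simply reads codons from the first base. It is not the ORF-prediction software used in this study, and the function name `translate` is hypothetical. The test sequences are two rows of the corazonin ORF shown in Fig. 5.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides: str) -> str:
    """Translate a DNA sequence in frame 1; trailing partial codons are ignored."""
    seq = nucleotides.upper().replace(" ", "")
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Two rows of the partial corazonin ORF (Fig. 5).
print(translate("agaggatggacgaacgggaggaagcgttcagaccctagcgtgggt"))
print(translate("gtgcggagggtggg"))
```

Stop codons are returned as an asterisk, matching the notation used in the figures.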

Crustacean cardioactive peptide (CCAP) 

One transcript was identified to putatively encode a complete 
139 aa open reading frame (ORF) of CCAP precursor starting 
with a 29 aa signal peptide followed by four predicted peptides (10, 
9, 52 and 23 aa in length), separated by carboxyl-peptidase 
cleavage sites. One of those peptides is highly conserved and 
contains two cysteine residues predicted to form a disulfide bridge 
and is amidated (Table 1 and Fig. 6). The highest identity level of
the entire ORF, excluding the signal peptide, was 81% with
another decapod crustacean CCAP, covering 75% of the ORF
(Table 2). The transcript encoding CCAP had significantly higher
expression in the eyestalk compared with the brain, with a higher 
expression in male eyestalk (Table 3). 

Crustacean hyperglycemic hormone (CHH) 

Five transcripts were identified to putatively encode three
complete and two partial CHH peptide precursors with 112-139
aa (Table 1 and Fig. 7). All three complete sequences start with a
predicted signal peptide of 25-26 aa. One partial sequence has
part of the signal peptide (16 aa). All 5 sequences have a CHH-conserved
domain of 71-73 aa, preceded by a carboxyl-peptidase
cleavage site. The 6 cysteine residues predicted to give rise to 3
disulfide bridges are all aligned between the 5 sequences (Fig. 7A-E).
Overall, the sequence similarity between the CHH domains is
high, with up to 89% identity between isoforms B1 and B2 (Fig. 7F,
G). Compared with previously described CHHs, identity of the
mature hormone was between 59% and 85% (Table 2). Isoforms B1-3
had the highest expression of all five transcripts and were found almost
exclusively in the eyestalk, while isoform B4 had much lower
expression (two orders of magnitude), only in the eyestalk. The
Figure 2. Type B allatostatin precursors predicted partial ORFs and conserved motif. A) N-terminus ORF (derived from Unigene40422_AII)
with signal peptide (red) and 8 predicted allatostatin peptides (green) with amidated glycine (light blue), separated by carboxyl-peptidase cleavage
sites (underlined). B) C-terminus ORF (derived from Unigene25318_AII) with 5 predicted allatostatin peptides (green), separated by carboxyl-peptidase
cleavage sites (underlined). Asterisk indicates the stop codon. C) Type B allatostatin peptides conservation: 13 predicted neuropeptides of
9-14 aa in length derived from 2 putative partial transcripts with XXDWXXXXXXGXWamide conserved.
doi:10.1371/journal.pone.0097323.g002



unspecified isoform had equivalent expression to that of isoform 
B4 in both the eyestalk and the brain. Interestingly, all five 
isoforms had higher levels in females compared with males 
(Table 3). 

Molt/Gonad-inhibiting hormone (MIH/GIH) 

Three transcripts were identified to putatively encode three
complete MIH/GIH peptide precursors with 111-115 aa (Table 1
and Fig. 8). All three sequences start with a predicted signal
peptide of 33-37 aa followed by an MIH-conserved domain of 74
aa. The 6 cysteine residues predicted to give rise to 3 disulfide
bridges are all aligned between all 3 sequences (Fig. 8A-C).
Overall, the sequence similarity between the MIH domains is
lower than that between the CHH isoforms, with 53%-54% identity (Fig. 8D).
Compared with previously described MIHs/GIHs, identity of the
mature hormone was between 55% and 72% (Table 2). All 3 putative
MIH transcripts were found to be specifically expressed in the
eyestalk, with isoform A2 showing the highest expression. Similar to
CHH, all three MIH isoforms showed higher expression levels in
females compared with males (Table 3).

Crustacean female sex hormone (CFSH)

One transcript was identified to putatively encode a complete 
CFSH peptide precursor with 278 aa (Table 1 and Fig. 9). The 
sequence starts with a 22 aa signal peptide and contains 10 
conserved cysteine residues predicted to form 5 disulfide bridges 



A
246   atgatgtcttgtgcagcccacctgttggtggccgggctagccctg
      MMSCAAHLLVAGLAL
291   gcactgacaatgtcccaggcactgccggcaccacccgccgccact
      ALTMSQALPAPPAAT
336   tccacccaccacaagaacctgcagacgcccacagcagcagatcag
      STHHKNLQTPTAADQ
381   gacgctaggatccagaagagggccattgcctccgaacccaacgag
      DARIQKRAIASEPNE
426   gaggagattgctaccttgaaggacctgatcctggcgcgggtggcg
      EEIATLKDLILARVA
471   tccgagctgcaggactcgtggcaggaccttccatccctcaagaag
      SELQDSWQDLPSLKK
516   gcccttggggaggagcagctgcaagaggaagcggccaagagcaag
      ALGEEQLQEEAAKSK
561   aggatgttcgcccccctctctggtcttcccggagaactgcccacc
      RMFAPLSGLPGELPT
606   attaagaggcaaatccgctaccaccagtgttacttcaaccccatc
      IKRQIRYHQCYFNPI
651   tcatgcttcaggcgaaagtga 671
      SCFRRK*

B
1387  atgctgacgcgaagctgtgtaagcctgatgatggtggccgtggtg
      MLTRSCVSLMMVAVV
1342  gctctggtggctgttagcagcgtgtcagccaaggctctgcctgac
      ALVAVSSVSAKALPD
1297  caggaaggccaagtcttccctcagacccagcagatgttggatccc
      QEGQVFPQTQQMLDP
1252  tacggcaaccaccttgttgacgacgacgggtccctggacaccgcc
      YGNHLVDDDGSLDTA
1207  ctcatcaactacctcttcgccaagcagatggtggcgcgtctgaga
      LINYLFAKQMVARLR
1162  aacagcgccgacgttaaggacctgcagatgaagcgctcctactgg
      NSADVKDLQMKRSYW
1117  aagcagtgcgccttcaacgccgtcagctgcttcggcaagaggaag
      KQCAFNAVSCFGKRK
1072  tga 1070
      *

C
CLUSTAL 2.1 multiple sequence alignment

Type C          QIRYHQCYFNPISCF- 15
Prohormone-1    -SYWKQCAFNAVSCFG 15

Figure 3. Type C allatostatin and prohormone-1 precursor predicted ORFs and conserved peptide. A) A complete ORF (derived from
CL2090.Contig2_AII) of type C allatostatin precursor with a signal peptide (red) and 3 predicted allatostatin peptides (green) with an amidated glycine
(light blue), separated by carboxyl-peptidase cleavage sites (underlined). B) A complete ORF (derived from Unigene59348_AII) of prohormone-1 with
a signal peptide (red) and a predicted allatostatin peptide (green), separated by carboxyl-peptidase cleavage sites (underlined). Two conserved
cysteine residues in the last allatostatin peptide of each sequence are highlighted in yellow. Asterisk indicates the stop codon. C) Amino acid
alignment between the conserved peptides of C type allatostatin and prohormone-1.
doi:10.1371/journal.pone.0097323.g003
237   atgagaagactgtcgtggtcactggtgggcgtggtggtgatggtg
      MRRLSWSLVGVVVMV
282   gtgggcgtggtgtgggcggacgagtgctcccccacgcccgtcatc
      VGVVWADECSPTPVI
327   cacatactctcctaccctggctgcacctccaagccaatcccttcc
      HILSYPGCTSKPIPS
372   ttcgcttgtcagggtcgttgtacctcctacgtgcaggtgtcaggc
      FACQGRCTSYVQVSG
417   agcaagctatggcagacagagaggtcgtgcatgtgctgccaggag
      SKLWQTERSCMCCQE
462   tccagagagaaggaggcttccgtcaccctcagctgtcccaaggct
      SREKEASVTLSCPKA
507   cgcagtggtgagcccaggaagaaaaagatcttaacccgagcccct
      RSGEPRKKKILTRAP
552   atcgactgtatgtgtcggccgtgcaccgacgttgaggagagcacc
      IDCMCRPCTDVEEST
597   gtgctggcccaggagatcgccaacttcatccagtattcgcccatg
      VLAQEIANFIQYSPM
642   ggcaacgtgcccttccttaggtag 665
      GNVPFLR*

Figure 4. Bursicon alpha subunit precursor predicted ORF. A
complete ORF (derived from CL593.Contig3_AII) of bursicon alpha
subunit precursor with a signal peptide (red) and a predicted C-terminal
cysteine knot-like domain (green). Ten conserved cysteine residues are
highlighted in yellow. Asterisk indicates the stop codon.
doi:10.1371/journal.pone.0097323.g004

1052  atgtcgaccatgagatggtgtggtcgtgctggcgtggtggtgact
      MSTMRWCGRAGVVVT
1007  gccgtcgtccttctggtgctcctggctgcccagacccacgccggg
      AVVLLVLLAAQTHAG
962   cccgtcgccaagagggacatcggcgacttactcgacggcaaggct
      PVAKRDIGDLLDGKA
917   aaacgacccttctgcaacgccttcacaggctgcggtaagaagcgg
      KRPFCNAFTGCGKKR
872   tcagaccctgagctggaggcagtagcctctggctcagaacttgac
      SDPELEAVASGSELD
827   gccctggccaagcacgtcctggccgaggccaagctgtgggagcaa
      ALAKHVLAEAKLWEQ
782   ctccagaacaagatggagatgatgcgcaccctggctggccgcatg
      LQNKMEMMRTLAGRM
737   gatagccagcacccactgtacaggaggaagaggtccaccgcccac
      DSQHPLYRRKRSTAH
692   cagacccgccaccacctcacttcctcacctaaactgaagatggaa
      QTRHHLTSSPKLKME
647   accgaaaagcagtga 633
      TEKQ*

Figure 6. Crustacean cardioactive peptide (CCAP) predicted
ORF. A complete ORF (derived from Unigene1674_AII) of CCAP with a
signal peptide (red) and four predicted peptides (green) with an
amidated glycine (light blue), separated by carboxyl-peptidase cleavage
sites. Two conserved cysteine residues are highlighted in yellow.
Asterisk indicates the stop codon.
doi:10.1371/journal.pone.0097323.g006



(Fig. 9), although the overall identity of the mature hormone does 
not exceed 26% with other decapod crustaceans (Table 2). CFSH 
was found to be specifically expressed in the eyestalk, with 
equivalent expression in both males and females (Table 3). 

Diuretic hormone (DH) 

One transcript was identified to putatively encode a complete
DH peptide precursor with 135 aa (Table 1 and Fig. 10). The
sequence starts with a 23 aa signal peptide, and the active 31-residue
DH peptide is released by cleavage at dibasic proteinase cleavage
sites. This peptide shared 90% identity with a clawed lobster DH
(Table 2). The transcript is expressed in both brain and eyestalk,
with non-significantly higher levels in the brain and in males (Table 3).

Eclosion hormone 

Two transcripts were identified to putatively encode complete
isoforms of the eclosion hormone precursor (Table 1 and Fig. 11)
with 82 and 86 aa, each starting with a signal peptide of 26-28 aa,
followed by 55-57 aa eclosion hormone domains each containing
6 conserved cysteine residues predicted to form 3 disulfide bridges
(Fig. 11A, B). Other than the cysteine residues, the similarity level
between the two eclosion hormone domains is intermediate, with
47% identity (Fig. 11C). Compared to other eclosion hormones,

62    atggtgaggacttccaggcaccagctgcagacggcactccttgtg
      MVRTSRHQLQTALLV
107   gccctcaccctaggcctggcagcggcccagaccttccagtacagc
      ALTLGLAAAQTFQYS
152   agaggatggacgaacgggaggaagcgttcagaccctagcgtgggt
      RGWTNGRKRSDPSVG
197   gtgcggagggtggg 210
      VRRV

Figure 5. Corazonin predicted precursor ORF. A partial ORF
(derived from Unigene32841_AII) of the N-terminus of corazonin
precursor with a signal peptide (red) and a conserved peptide (green)
with an amidated glycine (light blue), followed by a carboxyl-peptidase
cleavage site.
doi:10.1371/journal.pone.0097323.g005



identity of the S. verreauxi eclosion hormones was 49%-62% with insect eclosion
hormones (Table 2). The first isoform had a significantly higher 
expression in the eyestalk compared with the brain, and higher 
expression in males compared with females. The second isoform 
showed only a basal expression in the female eyestalk (Table 3). 

Follistatin 

Two transcripts were identified to putatively encode a complete
(133 aa) and a partial (204 aa) isoform of the follistatin precursor
(Table 1 and Fig. 12), each starting with a signal peptide of 15 aa,
followed by identical 23 aa follistatin domains each containing 4
conserved cysteine residues predicted to form 2 disulfide bridges
(Fig. 12A, B). In each predicted peptide, the follistatin domain is
followed by a 45 aa Kazal-type serine protease inhibitor domain
whose N-terminus, with 5 cysteine residues, is identical between the
isoforms, while the C-terminus contains 2 additional cysteine residues
in the partial isoform (Fig. 12C). The shorter, yet complete,
follistatin-like isoform ends with a 23 aa predicted transmembrane
region. The mature hormones showed identity of 38%-54% to
follistatins of a cnidarian and a mammalian species (Table 2). The
first transcript had very low expression in all tissues, and the
second transcript had very low expression and was exclusively
found in the female brain (Table 3).

Myostatin 

One transcript was identified to putatively encode a complete 
419 aa ORF of a myostatin precursor, starting with an 18 aa signal 
peptide, followed by a 136 aa TGF-beta propeptide domain, 
followed by another 96 aa TGF-beta domain (Table 1 and Fig. 13). 
The mature hormone showed 65% identity with another decapod 
crustacean myostatin (Table 2). Myostatin showed significantly 
higher expression in the eyestalk compared to the brain (Table 3). 

Myosuppressin 

One transcript was identified to putatively encode a complete 
myosuppressin peptide precursor with 100 aa (Table 1 and Fig. 14). 
The sequence starts with a 29 aa signal peptide and the active 10- 
Figure 7. Crustacean hyperglycemic hormone (CHH) precursors predicted complete and partial ORFs with high similarity levels. A-E) Complete and partial CHH isoforms 1-5 (derived from CL7809.Contig1_All, CL7809.Contig3_All, CL7809.Contig4_All, Unigene30324_All and 
Unigene34312_All) with signal peptides (red) and predicted CHH domains (green) with an amidated glycine (light blue), preceded by carboxyl- 
peptidase cleavage sites (underlined). Six conserved cysteine residues predicted to form 3 disulfide bridges are highlighted in yellow. Asterisk 
indicates the stop codon. F) CHH domains conservation: the 71-73 aa domains show a high level of similarity with each other. G) CHH domains 
phylogenetic tree showing that similarity levels are highest between isoforms B1 and B2, followed by isoform B4, then isoform B3, and furthest is the 
unspecified CHH isoform. Scale bar represents number of substitutions per site. 
doi:10.1371/journal.pone.0097323.g007 



residue myosuppressin peptide is released using dibasic and arginine 
proteinase cleavage sites. Overall, the prohormone showed 86% 
identity with the myosuppressin of the penaeid shrimp Penaeus monodon 
(Table 2). Myosuppressin showed similar expression in the eyestalk 
and the brain (Table 3). 



Neuropeptide Y (NPY) 

One transcript was identified to putatively encode a complete 
NPY precursor with 104 aa (Table 1 and Fig. 15). The sequence 
starts with a 26 aa signal peptide followed by a 36 aa pancreatic 
hormone/neuropeptide F/peptide YY family domain, which 



A 356 atgccacaagtaacggtatctcgaatggtgccaagctcctctgtg 
311 aagaaagtatggttactgctggtggttgctctcctgggtagcttc 
266 ctggtggagcagtcatcggccaggtttatcttcaacgaatgcccc 
221 ggcatgataggcaaccgagccttgtacaacaaggtggaggcagtg 
176 tgcagtgattgctacaacttgtaccgtaatgatcagttggaaatg 
131 aattgcaggaaagactgcttcgccaacaagagtttcctcctgtgt 
86 ctcttggctttgtttagaagcggcgaactcaaagacttcaatcgt 
41 tggatcagcattcttaacgctggacgaaagtaa 9 

B 982 atgaatatgccacaagtgactgtagtgccaggcttctctgtacag 
937 aggatgcggctgttactcttgatagctctcgttggaagcgtctta 
892 gtcgggcagtcttcggctagattctccttctacgaatgccctgga 
847 atgatgggtcagcgagacttgtacgacaaagtggaacaagtttgt 
802 gacgactgctacaatatctaccgcaaagaagaggttgccgtggag 
757 tgcagggaaggctgtttcatcaaccccaggttctccatgtgcctc 
712 tatgctacgatgagagaccatgaaacccaacgcttcaatgtttgg 
667 cgtagcattcttaaagctggacgaaagtaa 638 

C 304 atggtaactcaggcgtcaggcctctctctacagagggtttgtgtg 
349 ttggtactggcagcagtcgttctgggaggtttcctagcacaggaa 
394 acgtcagccagattccttgacgacgaatgtcaaggtgcaatgggc 
439 aaccgagacatctacaagaaggtggcgtgggtctgtgacgactgc 
484 gctaacatcttccgtaataatgacgtcggaataatatgtaggaag 
529 gactgtttccacaacagagacttcatgtggtgtctctatgccacg 
574 gaacgccacggggagctggagaacttcaagaggtggatcagtatt 
619 ctcagcgcaggacgcaagtga 639 

Figure 8. MIH predicted complete and partial ORFs with intermediate similarity. A-C) Complete MIH isoforms 1-3 (derived from 
Unigene47171_All, Unigene60521_All and Unigene58466_All) with signal peptides (red) and predicted MIH domains (green) with an amidated 
glycine (light blue). Six conserved cysteine residues predicted to form 3 disulfide bridges are highlighted in yellow. Asterisk indicates the stop codon. 
D) MIH domains conservation: the 74 aa domains show an intermediate level of similarity with each other. 
doi:10.1371/journal.pone.0097323.g008 
171 atgttgcagcagctggtgatacaactggcgctggcctgggcctgt 

216 acagtgctggtggccgccgcctctggcagtcaagacgctgctctc 
TVLVAAASGSQDAAL 

2 61 caagccttcggtaaagatggccagcatgaagagtggccctggtcg 

QAFGKDGQHEEWPWS 
30 6 cctccacagtggtggtggctcagtaacgttctctccttctcccgc 

PPQWWWLSNVLSFSR 
351 ggccacctgcatggccaggcctccgcctcacctggcaccacggcc 

GHLHGQASASPGTTA 

3 96 ctcagcacacaggatcagcagacctcaccgctctctgtgctcctg 

LSTQDQQTSPLSVLL 
441 cctctggagggagcgggcgagggcgacaaggtgaaggaggaggcg 
PLEGAGEGDKVKEEA 

4 8 6 tggcgggtggggaagcggtcccgggtctgcaggtctggagagaag 

WRVGKRSRVCRSGEK 
531 ggcgcctgtgtcaccggcctgatctccttcacggaggtgtggcag 

GACVTGLISFTEVWQ 
57 6 ggctggaaggatgactacctctccgtgccgcaggccatggtcaag 

GWKDDYLSVPQAMVK 
621 ttctcccaagagcaggcgggggacaacgtctgtaaggacctctcc 

FSQEQAGDNVCKDLS 
666 gtgcagctcttcagcgtggacctgagggagcaccacatagagcca 

VQLFSVDLREHHIEP 
711 ctgtgggtgcgggagaccgtctacatcggcatgtgtccctccaga 

LWVRETVYIGMCPSR 
75 6 ctccagacgcgtcacctaggtgataacgtgtggcctcccaaagtg 

LQTRHLGDNVWPPKV 
8 01 gtggagaccaagtgtctgtgtcagcggcagtcctgctccaacctg 

VETKCLCQRQSCSNL 
84 6 ggcggcgacttcctgtgtcaggcggtgcgacgccctgtcacggtc 

GGDFLCQAVRRPVTV 
8 91 tggctgcggcgagacaagaccttcctgccctcccaggagatgctc 

WLRRDKTFLPSQEML 
93 6 tccgtgggctgcgtctgtgtccagcgcatcagcacccagggccgg 

SVGCVCVQRISTQGR 
981 tacgccgacccgggactgtcctcctag 1007 

YADPGLSS^ 

Figure 9. CFSH precursor predicted complete ORF. A complete 
CFSH-like peptide (derived from Unigene48118_All) with a signal 
peptide (red) and 10 conserved cysteine residues predicted to form 5 
disulfide bridges (yellow). Asterisk indicates the stop codon. 
doi:10.1371/journal.pone.0097323.g009 

showed 57% identity with an NPY from a mollusk (Table 2). 
Neuropeptide Y showed significantly higher expression in the 
eyestalk compared to the brain (Table 3). 

Neuroparsin 

Two transcripts were identified to putatively encode complete 
neuroparsin peptide precursors with 103 and 102 aa (Table 1 and 
Fig. 16A, B). Both sequences contain a 93-101 aa neuroparsin 
domain. Similarity between the two domains is low (44% identity), although all 12 
cysteine residues, predicted to form 6 disulfide bridges, are aligned 
(Fig. 16C). Despite the low similarity between the two isoforms, 
both showed similarity to the same neuroparsin of a 
spiny lobster (97% and 48%; Table 2). The first neuroparsin 
encoding transcript had higher expression compared with the 
second transcript. In both cases the expression was not significantly 
different between tissues, due to high variation between 
males and females (Table 3). 

Orcokinin 

One transcript was identified to putatively encode a complete 
orcokinin peptide precursor with 205 aa (Table 1 and Fig. 17), 
starting with a signal peptide of 20 aa, followed by 11 putative 
neuropeptides, separated by dibasic proteinase cleavage sites 
(Fig. 17A). The predicted neuropeptides are 8-13 aa in length with 



213 atgaccaacacaggcgccgtcttcgcttctctggtgctggccgtc 

25 8 atcttcctgtcctcggtcaactcggtccccctcaacagggagacg 
^^^^^^^^^^^^^B V P L N R E T 

3 03 cgggcggtggtggagatagaggacccggactacgtgctggagctg 
RAVVEIEDPDYVLEL 

34 8 ctgaccagactgggacactccatcatcagggccaatgagttagaa 
LTRLGHSIIRANELE 

3 93 aaattcgtgcgttcctccggcagcgccaagcgaggactggacctg 

KFVRSSGSA K R G L D L 

438 ggtctaggcaggggcttcagtggttcccaggcagccaaacatctg 
GLGRGFSGSQAAKHL 

4 83 atgggccttgccgccgccaactatgctggaggccctggcaggagg 

MGLAAANYAGGPG R R 

52 8 aggagaagccctgaggacaccctcgacctccaccatgacgacacc 

R R SPEDTLDLHHDDT 

57 3 ctctatgcccatgatcaagctgccgatgtggcagagtcaacacga 

LYAHDQAADVAESTR 
618 taa 620 



Figure 10. DH precursor predicted complete ORF. A complete 
DH-like peptide precursor (derived from CL8244.Contig1_All) with a 
signal peptide (red) and a conserved peptide (green) with an amidated 
glycine (light blue), bordered by carboxyl-peptidase cleavage sites. 
Asterisk indicates the stop codon. 
doi:10.1371/journal.pone.0097323.g010 

NFDEIRDRXGFGFX as the most conserved motif (Fig. 17B). 
All 11 neuropeptides had high homology (5 identical) with 
orcokinin of either the clawed lobster Homarus americanus or the 
red swamp crayfish Procambarus clarkii (Table 2). Orcokinin showed 
higher expression in the male brain compared with the female 
brain, with similar expression in the eyestalk and the brain 
(Table 3). 
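
How the 11 orcokinin peptides are delimited by cleavage sites can be sketched in a few lines of Python. The sketch below is only an illustration under stated assumptions, not the prediction procedure used in this study: the propeptide string is transcribed from Fig. 17A as it appears in this text (and may carry transcription errors), the names `PROPEPTIDE` and `candidate_peptides` are hypothetical, and the cleavage rule is deliberately simplified to "any run of two or more K/R residues".

```python
import re

# Orcokinin precursor after the signal peptide, transcribed from Fig. 17A.
# Assumption: the transcription of the figure is accurate.
PROPEPTIDE = (
    "GPIKPALARPSSQHDADFADAARIKRFDAFTTGFGHSKRN"
    "FDEIDRSGFAFAKKNFDEIDRAGLGFAKRNFDEIDRSGFGFNKRN"
    "FDEIDRAGLGFHKRNFDEIDRSGFGFNKRNFDEIDRTGFGFHKRD"
    "YDGVYPDKRNFDEIDRAGFGFVKRAFGPRDISNLYKRNFDEIDRS"
    "GFGFVRRNAE"
)


def candidate_peptides(seq, min_len=8, max_len=13):
    """Split at runs of >= 2 basic residues (simplified dibasic rule)
    and keep fragments within the expected neuropeptide length range."""
    fragments = re.split(r"[KR]{2,}", seq)
    return [f for f in fragments if min_len <= len(f) <= max_len]


peptides = candidate_peptides(PROPEPTIDE)
for p in peptides:
    print(len(p), p)
```

Under these assumptions the length filter discards the N-terminal propeptide fragment and the short C-terminal remainder, leaving the fragments that correspond to the predicted neuropeptides.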

Pigment dispersing hormone (PDH) 

Two transcripts were identified to putatively encode complete, 
highly similar isoforms of PDH precursors (Table 1 and Fig. 18) 
with 79 aa, both starting with an identical signal peptide of 22 aa, 
followed by a 23 aa transmembrane region in only one isoform, 
followed by a carboxy-peptidase cleavage site prior to an 18 aa 
PDH domain in both isoforms (Fig. 18A, B). Of the 18 aa, 15 are 
identical and the other 3 are similar (Fig. 18C). Both neuropeptides 
had high homology with previously identified PDHs of 
decapod crustaceans (Table 2). Both of the PDH encoding 
transcripts showed significantly higher expression in the eyestalk 
compared with the brain and a higher level in the male brain 
compared with the female brain. 
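
The 15-of-18 identity reported for the two PDH peptides can be checked with a short position-by-position comparison. This is a minimal sketch that assumes the two 18 aa peptides as read from Fig. 18A and 18B in this text (the residues between the KR cleavage site and the amidated glycine); the names `PDH_1`, `PDH_2` and `percent_identity` are hypothetical, and no alignment step is needed because the peptides are of equal length.

```python
# Predicted PDH peptides, transcribed from Fig. 18A and 18B.
# Assumption: the transcription of the figure is accurate.
PDH_1 = "NAELINSILGLPKVMNDA"
PDH_2 = "NAELINSLLGIPKVMSDA"


def percent_identity(a, b):
    """Return (number of identical positions, percent identity)
    for two sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches, 100.0 * matches / len(a)


matches, identity = percent_identity(PDH_1, PDH_2)

# 1-based positions where the two peptides differ
mismatches = [
    (i + 1, x, y) for i, (x, y) in enumerate(zip(PDH_1, PDH_2)) if x != y
]

print(matches, round(identity, 1), mismatches)
```

The three differing positions are I/L, L/I and N/S exchanges, which is consistent with the description of the remaining residues as similar.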

Prohormone-3 

One transcript was identified to putatively encode a complete 
prohormone-3 peptide precursor with 196 aa (Table 1 and Fig. 19). 
The sequence starts with a 21 aa signal peptide and contains 12 
cysteine residues (Fig. 19), all conserved with other insect 
prohormone-3 sequences, with up to 43% identity in sequence 
(Table 2). Prohormone-3 encoding transcript showed higher 
expression in the eyestalk compared to the brain, with higher 
expression in the male brain compared with the female brain 
(Table 3). 

Prohormone-4 

One transcript was identified to putatively encode a partial 
C-terminus of a prohormone-4 peptide precursor with 143 aa (Table 1 
and Fig. 20). The highest homology to an insect species was 89% 
(Table 2). The prohormone-4 encoding transcript showed higher 
expression in the brain compared to the eyestalk, with higher 
A 1491 atgtctcttaaggcggaggtacgcgtcgtgctggtgggagtcctg 
1446 tgcctgctggccctagcctccctctcgcaggccgccaccatcacc 
1401 agcatgtgcatcaggaactgcggccagtgcaaggagatgtacggc 
SMCIRNCGQCKEMYG 
1356 gactacttccacgggcaggcgtgcgccgagtcctgcatcatgacg 
DYFHGQACAESCIMT 
1311 cagggcgtcagcatcccagactgtaacaaccccgctaccttcaac 
QGVSIPDCNNPATFN 
1266 cgcttcctgaagaggttcatctag 1243 
RFLKRFI* 

B 216 atgtctggctccagaaaggtcgtggcctcggccctgctggtgctg 
261 agcgtggtgatggtgctgctgctcccctccgtctccgccgccccc 
306 aacaaggtctccctctgcatcaagaattgtgcccagtgtaaggag 
NKVSLCIKNCAQCKE 
351 atgtaccacgcccacttcaagggcggcctctgcgccgacttctgc 
MYHAHFKGGLCADFC 
396 ctccagtccaaaggtcgcttcatgccggactgcggccggcctcat 
LQSKGRFMPDCGRPH 
441 accgtcttgcctttcttcctccagcggctggagtga 476 

Figure 11. Eclosion hormone precursor predicted ORFs and conserved peptide. A, B) Complete ORFs (derived from CL2590.Contig2_All 
and Unigene55076_All) of eclosion hormone precursor each starting with a signal peptide (red) followed by an eclosion hormone domain (green) 
with 6 conserved cysteine residues (yellow). Asterisk indicates the stop codon. C) Amino acid alignment between the eclosion hormone domains. 
doi:10.1371/journal.pone.0097323.g011 



expression in the male compared with the female, in both eyestalk 
and brain (Table 3). 

Red pigment concentrating hormone (RPCH) 

One transcript was identified to putatively encode a complete 
RPCH peptide precursor with 99 aa (Table 1 and Fig. 21). The 
sequence starts with a 21 aa signal peptide followed by the 
8-residue RPCH peptide (with 100% identity to the peptides of other 
RPCHs) and an RPCH-associated peptide C-terminal domain 
(Fig. 21). The overall prohormone shared 63% identity with the 
blue swimmer crab Callinectes sapidus RPCH (Table 2). The red 
pigment concentrating hormone encoding transcript showed 
higher expression in the eyestalk compared to the brain (Table 3). 



Sulfakinin 

One transcript was identified to putatively encode a complete 
sulfakinin peptide precursor with 115 aa (Table 1 and Fig. 22). 
The sequence starts with a 27 aa signal peptide followed by two 
sulfakinin putative peptides of 10 aa and 13 aa, separated by 
carboxy-peptidase cleavage sites (Fig. 22). The two peptides had 
high homology with sulfakinin of H. americanus (Table 2). 
Sulfakinin encoding transcript showed higher expression in males 
compared to females both in the brain and the eyestalk (Table 3). 

Tachykinin 

One transcript was identified to putatively encode a complete 
tachykinin peptide precursor with 226 aa (Table 1 and Fig. 23). 
The sequence starts with a 22 aa signal peptide followed by seven 






A 97 atgcggctcatcgtcgtatgtctggcagttgcgtcagcgacagca 

142 tttgatcttcgcggagaacgggatctctgtgataatgtggagtgt 

FDLRGERDLCDNVE'C 
187 cgagcaggacgtgaatgtgtggtgagccatggcgttgcccattgt 

GRECVVSHGVAHC 
232 cagt gcatccaggtg t gccctgaccactatggccctgtc t gtg gc 

277 tcagatgacaattcctacgataaccac t gcctgcttcaccgcc at 

322 gcc t gtctcaccgtcagttcaaaaactatatctgcatttacca gt 

367 accaccactttacaaaag attatcctaatcattgccataactc tg 
I T T L Q K ^^^^^^^^^^^^^^^^B 

412 tttgtactattaccagcaatatttcaatatgcctacattt atata 

457 aaactgacaccaccatctttgccatcatcacttacatcataa 498 
KLTPPSLPSSLTS* 

B 97 atgcggctcatcgtcgtatgtctggcagttgcgtcagcgacagca 

142 tttgatcttcgcggagaacgggat ctc tgtgataatgtggagtgt 
FDLRGERD ^Icl D N V E C 

187 cgagcaggacgtgaatgtgtggtgagccatggcgttgcccattgt 
RAGRECVVSHGVAHC 

232 cagtgcatccaggtgtgccctgaccactatggccctgtctgtggc 



27 7 tcagatgacaattcctacgataaccac t gcctgcttcaccgcc at 
322 gcc t gtctcaccgaggaacacatcagagttcattacaagggct tc 

3 67 tgcaagaagacaaaacaagtgaaagtaaagccagtgaaaaaggat 

CKKTKQVKVKPVKKD 
412 gagccagctgtgtgctacagcccccagcgtgacgctctccttctc 
EPAVCYSPQRDALLL 

4 57 gtgttggggaagcactggcaagatacacttcaggaacagccgtgg 

VLGKHWQDTLQEQPW 
502 catgtctctggaatgacatatagagaaagtctgtggggacgcttc 

HVSGMTYRESLWGRF 
54 7 ttcacctgtgatgttgataaggataaatatcttgattctgatgag 

FTCDVDKDKYLDSDE 

5 92 ctggttaactgcacctccgatgctttctttatggcacgtcccgag 

LVNCTSDAFFMARPE 
637 caggatcaagaactcaccagggctctatgcgtggatgccattgta 

QDQELTRALCVDAIV 
682 gatatggcagacaccaatcgtgactgg 708 

DMADTNRDW 



Figure 12. Follistatin precursors predicted ORFs and conserved peptide. A, B) Complete and partial follistatin precursor predicted ORFs 
(derived from CL3958.Contig2_All and Unigene49446_All), each starting with a signal peptide (red) followed by an identical follistatin domain (green) 
with 4 conserved cysteine residues (yellow), followed by a kazal-type serine protease inhibitor domain (pink) with 5-6 cysteine residues (yellow). The 
complete, shorter isoform (A) ends with a predicted transmembrane domain (blue). Asterisk indicates the stop codon. C) Amino acid alignment 
between the kazal-type domains. 
doi:10.1371/journal.pone.0097323.g012 
1783 atgcagtggactcgctaccttcttcttaccctggtggtcatgcag 

1738 gctctgacggaagccaagcgcaagaagaacaacagaactagacaa 

ALTEAKRKKNNRTRQ 
1693 gacaaaaggaatcagctagagagagctcgtgccgatgaaacggga 

DKRNQLERARADETG 
164 8 actagtgaaatccaactgcctcaagaaggcacagaggctcctcgt 

TSEIQLPQEGTEAPR 
1603 cacagcggagagcatcgccatcggttgcaccactgcgctagctgt 

HSGEHRHRLHHCASC 
155 8 taccagatccgtaagaaattgaggttagcgcagataaaggacaga 

YQIRKKLRLAQIKDR 
1513 gtgttgactgctactggcctgctaactccgccaaacatgaccgga 

VLTATGLLTPPNMTG 
14 68 attgtgatatctaaaaacccaaacatccaagggattattgacgaa 

IVISKNPNIQGIIDE 
142 3 atgaatgcctcctccccccactcgtcctacatgcaggaatctccg 

MNASSPHSSYMQESP 
137 8 tacaataccgacgagccagacatcaagactgagaggatgttttct 

YNTDEPDIKTERMFS 
1333 cccgtcgaaccaggtaacaactactcttcaggctcagcgccaccg 

PVEPGNNYSSGSAPP 
12 8 8 ggtctgaacatccctcccaacttggatatcttgtacttcaaactg 

GLNJPPNLDILYFKL 
12 4 3 aacttcgagcagttgggcaaccgagtcaagagggccatcctgcac 

NFEQLGNRVKRAILH 
1198 gtctggctcaagcctatgcactccgagctggaccggaccgtcccc 

VWLKPMHSELDRTVP 
1153 atctccgtatacaaggtctgccgacctgtcaaccccggaggacac 

tSVYKVCRPVNPGGH 
1108 gtcaccactgttgaggtgacgacggtgtcggagtccttcgacgcc 

VTTVEVTTVSESFDA 
10 63 cgggaggggaactgggtgaagattgaggtgtacaagttgttgcag 

REGNWVKIEVYKLLQ 
1018 gagtggctgaacaagcccgaggacaacctggggcttgtagtctcc 

EWLNKPEDNLGLVVS 
97 3 gccatcgattccgagggacggcaagtggttgtcacagaccccaaa 

AIDSEGRQVVVTDPK 
92 8 gagatgccttccaatgcgccgctgctggagatccacacggaggag 

EMPSNAPLLEIHTEE 
8 83 ggcagaaggagtcgaacccgacgtaacagcgcgagttacgtctgc 

GRRSRTRRNSASYVC 
838 accaacaacattacagacacccgc tgctgcaggtatcgactgg tc 

TNNITDTR 
7 93 gtcgacttcctgcaactaggttgggacttcatcgtcgccccaa ag 

74 8 atatatgaggccaacttttgtaatggcgagtgccccttcctct ac 

7 03 gctcacaagtacgcccacaccacccttatccagaagctgaaca gc 
658 actagcgcccagcacgggccttgctgtggagcgaggaaattat ct 
613 cccatgaaaatgctttactatgatcatgatcaaaaaatcaaat tt 
568 gacacgatccaggacatggtagtggaccgctgtgggtgct cctaa 524 

Figure 13. Myostatin precursor predicted ORF. A complete myostatin predicted ORF (derived from CL113.Contig2_All) starting with a signal 
peptide (red) followed by a TGF-beta propeptide domain (green), followed by another TGF-beta domain (pink). Asterisk indicates the stop 
codon. 

doi:10.1371/journal.pone.0097323.g013 
654 atggtgttccgcaattgctcatggtgctctctcctgctggtgggc 
609 gttgtggtggtggtgggcgtgtgcgcgggcctgggcgaggccgcc 

5 64 cccccgcccatctgcctgaaccagaagctccccctcagcccctac 

PPPICLNQKLPLSPY 
519 gccaagaagctatgcctcgccctcaccaacatctccaagttctcc 

AKKLCLALTNISKFS 
474 cgagcaatggaggaatatctcgacggtgaagccatcaagaacagt 

RAMEEYLDGEAIKNS 
42 9 ttgcccgtgaacgagccagagatcaagcggcaagacctggaccac 

LPVNEPEI K R Q D L H 

384 gtcttcctgcgcttcggacgatcccagcaatag 352 

VFLRFGRSQQ^ 

Figure 14. Myosuppressin precursor predicted complete ORF. A 
complete myosuppressin peptide precursor (derived from 
Unigene55051_All) with a signal peptide (red) and a conserved peptide 
(green) with an amidated glycine (light blue), bordered by 
carboxyl-peptidase cleavage sites. Asterisk indicates the stop codon. 
doi:10.1371/journal.pone.0097323.g014 

identical tachykinin putative peptides of 9 aa each 
(APSGFLGMRamide), separated by carboxy-peptidase cleavage sites (Fig. 23). 
This peptide was found to be identical to the tachykinin found in 
P. clarkii (Table 2). The tachykinin encoding transcript showed 
significantly higher expression in the brain compared with the 
eyestalk (Table 3). 

Discussion 

This study has elucidated the putative neuropeptidome of the 
previously uncharacterized Eastern rock lobster S. verreauxi. 
Overall, 37 partial and complete transcripts were identified which 
putatively encode 21 peptide families/sub-families (Table 1). 
These included three partial allatostatin type A transcripts, 
where one is presumed to represent the N-terminus (Fig. 1A), the 
other is presumed to represent the middle region (Fig. 1B) and the 
third is presumed to represent the C-terminus (Fig. 1C). It is 
conceivable that these three transcripts are parts of one larger 
transcript which includes all three, as in most studied arthropod 
species only one type A allatostatin gene was identified [22], 
except for blowflies [23]. Overall, there are 22 mature peptides of 8 
aa predicted to arise from the above three transcripts, each 
containing the highly conserved YXFGLamide motif (Fig. 1D), 



found in all arthropod type A allatostatins [22]. Two partial 
peptides were identified as the putative N-terminus and 
C-terminus of type B allatostatin precursors (Fig. 2A and B, 
respectively). The level of conservation between the 13 putative 
mature peptides encoded by these transcripts was much lower 
compared with the conservation between the predicted type A 
allatostatins, and six are novel (Fig. 2C). Two transcripts were 
identified to encode complete type C allatostatin precursors 
with very low conservation between the two predicted mature 
peptides, which include the signature cysteine residues of the type 
C allatostatins (Fig. 3A, B, C). The latter sequence, whose best 
BLAST hit was the predicted prohormone-1 of the honey bee 
(Table 1), includes the predicted mature peptide 
SYWKQCAFNAVSCFamide, which is broadly conserved among crustaceans [24]. 
Most of the mature peptides had very high homology with those of other 
arthropods, primarily other decapod crustacean species. Most 
prominent was the conservation of type A allatostatin-derived 
peptides with those of the spiny lobster P. interruptus and the 
broadly conserved peptide in prohormone-1 (Table 2). 

One complete bursicon alpha subunit predicted sequence 
was identified, containing a signal peptide and a predicted 
C-terminal cysteine knot-like domain (Table 1, Fig. 4) with 11 
cysteine residues well conserved with other crustacean and insect 
species, 10 of which are hypothesized to form five disulfide bridges 
[25]. Another transcript is hypothesized to be the N-terminal part 
of a corazonin precursor, comprising a signal peptide, followed 
by the 11 aa conserved peptide which is the signature of corazonin 
(QTFQYSRGWTNamide) [26], followed by a carboxy-peptidase 
cleavage site (Table 1 and Fig. 5). Another sequence is predicted to 
encode the crustacean cardioactive peptide (CCAP) precursor, 
with 139 aa and high similarity to other crustacean sequences 
(Tables 1 and 2, Fig. 6). 
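
The corazonin features described above (signature peptide, amidation glycine, basic cleavage site) can be located directly in the partial ORF of Fig. 5. The following is a hedged sketch rather than the annotation pipeline used in this study: the nucleotide fragment is transcribed from Fig. 5 as it appears in this text, starting at the first codon after the signal peptide; the standard genetic code is assumed; and the names `translate`, `ORF_FRAGMENT` and the regular expression are hypothetical.

```python
import re
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate in frame 1, ignoring a trailing incomplete codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Partial corazonin ORF downstream of the signal peptide (from Fig. 5).
# Assumption: the transcription of the figure is accurate.
ORF_FRAGMENT = (
    "cagaccttccagtacagc"
    "agaggatggacgaacgggaggaagcgttcagaccctagcgtgggt"
    "gtgcggagggtggg"
)

protein = translate(ORF_FRAGMENT)

# Signature peptide, then the glycine that donates the amide group,
# then a run of basic residues taken here as the cleavage site.
match = re.search(r"(QTFQYSRGWTN)G([KR]{2,})", protein)

print(protein)
print(match.group(1) + "amide", "cleavage site:", match.group(2))
```

The glycine immediately following the 11 aa peptide is what is shown in light blue in Fig. 5; the pattern above simply encodes that layout.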

Five sequences were identified to encode four predicted 
complete and near complete type B CHH (crustacean 
hyperglycemic hormone) precursors and another unspecified CHH 
precursor. The putative peptides were identified to be specific to the 
eyestalk, as expected from CHHs, and included a signal peptide (in 
4 out of 5 sequences) and a conserved CHH domain (Table 1, 
Fig. 7). Although the occurrence of splice variant-derived 
isoforms of CHH is well documented [27], we currently cannot 
rule out that the high similarity between the 5 sequences identified 
(up to 89% identity) is due, at least in part, to sequencing/assembly 



12 41 atgcgaggtcacgtgatggcagcggcggtgatggtggtggtggtg 

1196 gtgacgctgctagctcccgtgccctcggccg ccagacacgacagc 

^^^^^^^^^^^^^^^^^^^B R H D S 

1151 tcggcggcggavcgccctccaagccattcacgaggccgccatggct 

SAADALQAIHEAAMA 
1106 ggcatcctgggatccaccgaagtccagtaccctaaccgacccagc 

GILGSTEVQYPNRPS 
10 61 at cttcaagtc cccagtcgaactacggcagtacctcgatgctctc 

I F K S PVELRQYLDAL 
1016 aatgcctactacgctatcgccggcagaccaaggtttggcaagcgg 

NAYYAIAGRPRFG K R 

971 ggaagtcatggtccccagcgaccggaggaaaattacgactattga 927 

GSHGPQRPEENYDY* 

Figure 15. Neuropeptide Y (NPY) precursor predicted complete ORF. A complete NPY precursor (derived from Unigene30121_All) starting 
with a signal peptide (red) followed by a pancreatic hormone/neuropeptide F/peptide YY family domain (green) with an amidated glycine (light 
blue). Asterisk indicates the stop codon. 
doi:10.1371/journal.pone.0097323.g015 
A 1035 atgaggacttcctgctccagcggcgtcaccttcctcctcgcctcc 

MRTSCSSGVTFLLAS 
990 tgttccctcctcctcctcctgcaatatgcagcagccggtcctgcc 

CSLLLLLQYAAAGPA 
94 5 tgtccacaccggaatgaaatagtgaccgaagatctcagccagtgc 

CPHRNEIVTEDLSQC 
90 0 aagtacggcgt tgtgct aggctggtgcggcaacaaggcctgcggc 

K Y G V ^^^F GWCGNKACG 
8 55 aagggcccagaagacacgtgtggaggacgctgggagcagcacggg 

E:gpedtcggrweqhg 

810 atctgcggtgaggggatgtactgcgtgtgtggccactgcgccggg 
ICGEGMYCVCGHCAG 

765 tgctccagcaagctcaagtgtgccctgggcaggttctgctag 724 
CSSKLKCALGRFC^ 



B 142 atgagaacttcaagaaccgccatcaccttcttcgtcgcttccttc 
MRTSRTAITFFVASF 
187 tgccttgctcttcttatccgggaagcgacagcggccccgcgctgc 
CLALLIREATAAPRC 

2 32 aagaaccacgaccagccggccccaagtgactgcaagtacggcgag 

KNHDQPAPSDCKYGE 
277 gtaaaggactggtgccgtaatggagtctgcgctaagggtccagga 

^KDWCRNGVCAKGPG 
322 gagaagtgcggtggacactggtggaaggaaggcaaatgtggccga 

EKCGGHWWKEGKCGR 

3 67 ggaacctactgctcctgcggctactgcactggctgctctgctgtt 

GTYCSCGYCTGCSAV 
412 gtgaatggcgattgctcaccccctacattaatatgttag 450 
VNGDCSPPTLIC* 
Figure 16. Neuroparsin precursor predicted ORFs and conserved peptide. A, B) Complete neuroparsin precursor predicted ORFs (derived 
from CL2744.Contig6_All and Unigene5705_All) each with a neuroparsin domain (green) with 12 conserved cysteine residues (yellow). Asterisk 
indicates the stop codon. C) Amino acid alignment between the neuroparsin domains. 
doi:10.1371/journal.pone.0097323.g016 



errors rather than actual isoforms. Three sequences were identified 
to putatively encode complete isoforms of Molt/Gonad-inhibiting 
hormone (MIH/GIH). All predicted isoforms included a signal 
peptide followed by a conserved MIH/GIH domain with 



intermediate similarity (up to 54% identity; Table 1, Fig. 8), 
suggesting that these more reliably represent true isoforms, 
compared with the predicted CHHs. The homology of CHHs and 
MIHs with others identified in decapod crustaceans was in some 
A 1257 atgtccggggaggtgttcagccttctcttcctcctgagcctctgc 
1212 gcctttactgccgccggccccatcaagcccgctctggccaggccc 
GPIKPALARP 
1167 agctctcagcacgacgccgacttcgcagatgctgctcgcataaag 
SSQHDADFADAARIK 
1122 cggttcgatgccttcacgactggctttggacacagcaaacgcaac 
RFDAFTTGFGHSKRN 
1077 ttcgatgagatcgaccgctcagggttcgcgtttgccaagaagaac 
FDEIDRSGFAFAKKN 
1032 ttcgatgaaatcgaccgagcagggctcggattcgctaagcgcaac 
FDEIDRAGLGFAKRN 
987 ttcgatgagatcgaccgatctggctttggatttaacaagcgcaat 
FDEIDRSGFGFNKRN 
942 ttcgacgagatcgaccgcgctggcctcggctttcacaaacgtaac 
FDEIDRAGLGFHKRN 
897 ttcgacgagatcgaccgatcaggctttggatttaacaagcgcaac 
FDEIDRSGFGFNKRN 
852 ttcgacgagatcgaccgcaccggtttcggtttccacaagcgagac 
FDEIDRTGFGFHKRD 
807 tatgatggcgtctaccctgacaagaggaacttcgacgagatcgac 
YDGVYPDKRNFDEID 
762 cgcgccggcttcggcttcgtgaagagagcttttggacccagggac 
RAGFGFVKRAFGPRD 
717 atctccaacttgtataagcgtaactttgatgaaattgatcgttct 
ISNLYKRNFDEIDRS 
672 ggctttggctttgtccgacgcaatgccgagtga 640 
GFGFVRRNAE* 

Figure 17. Orcokinin precursor predicted complete ORF and conserved motif. A) A complete orcokinin precursor predicted ORF (derived 
from Unigene692_All) with signal peptide (red) and 11 predicted orcokinin peptides (green), separated by carboxyl-peptidase cleavage sites 
(underlined). Asterisk indicates the stop codon. B) Orcokinin peptides conservation: 11 predicted neuropeptides of 8-13 aa in length with 
NFDEIRDRXGFGFX conserved. 
doi:10.1371/journal.pone.0097323.g017 



cases higher than the homology between the isoforms themselves 
(Table 2), consistent with these genes having diverged a long 
time ago. Most CHH and MIH isoforms were found to be expressed 
predominantly in the eyestalk, with three of the CHH isoforms and 
one MIH isoform being the most abundantly expressed (Table 3). In 



most isoforms higher expression was found in females, suggesting 
that the females sampled were more advanced in the molt cycle. 
Repeating the neuropeptidome analysis with more samples of 
males and females of distinct molt stages will enable better 
distinction between neuropeptides whose expression changes with 
A 56 atgcgcagcgccgtggccgtagcgatgctggtggtgctggcgatg 

101 gctgccgtcctcacccagg cgcaggagctgaagtaccccgagcgt 

^^^^^^^^^^^B QELKYPER 
14 6 gaggtggtggcagaactggcggcgcagatcctgcgtgtggctcag 

EVVAELAAQILRVAQ 
191 ggaccctggggctccgccgtcgtaggacctcacaagcgcaatgcc 

GPWGSAVVGPH K R N A 

23 6 gaactgatcaactccatcttgggccttcctaaggtgatgaacgac 

ELINSILGLPKVMND 
281 gccggcaggagatag 295 
R R ^ 

B 229 atgcgcagcgccgtggccgtagcgatgctggtggtgctggcgatg 

274 gctgccgtcctcacccagg cgcaggagctgaagtaccccgagcgt 

^^^^^^^^^^^B QELKYPER 
319 gag gtggtggcaggactcgcggccaagatcctgcatctcgccc tg 

364 ggtcctgcgggatacgctgctgtag gaacccagaagcgc aacg cc 
^^^^^^^^^^^^^^^B T Q K R 

409 gage t ga t caac t ccctcctcggcatccccaaggt gat gag tgac 
ELINSLLGIPKVMSD 

454 gccggcagaaggtag 468 
^ G R R ^ 
Figure 18. PDH precursor predicted complete ORFs and conserved motif. A, B) Two complete PDH precursor predicted ORFs (derived from 
CL7594.Contig2_All and CL7594.Contig3) each starting with an identical signal peptide (red), a transmembrane region in one isoform (dark blue) and 
a predicted PDH peptide (green), preceded by a carboxyl-peptidase cleavage site (underlined) in each predicted isoform with an amidated glycine 
(light blue). Asterisk indicates the stop codon. C) PDH peptides conservation: 15/18 aa are identical, with the other 3 similar in characteristics. 
doi:10.1371/journal.pone.0097323.g018 



relation to the molt cycle and neuropeptides whose expression changes
between genders. Another sequence, which was found to be expressed
specifically in the eyestalk, was predicted to encode a complete
crustacean female sex hormone precursor (CFSH; Table 1,
Fig. 9). CFSH was recently identified in two brachyuran crabs and
was found to be specifically expressed in the female eyestalk.
CFSH knock-down was shown to inhibit the appearance of the
female reproductive characteristics which accompany the terminal
molt in these species (GenBank accession number ADO00266).
Interestingly, the putative CFSH in S. verreauxi identified in this
study was found to be specific to the eyestalk, although it is also
present in male eyestalks at the same level of expression as in
females.

One transcript was predicted to encode a complete calcitonin-like
diuretic hormone (DH), with high similarity to the one
identified in the American lobster H. americanus [28] (Table 1,
Fig. 10). Two transcripts were predicted to encode two complete
eclosion hormone precursor isoforms (with 47% identity), each
starting with a signal peptide and containing 6 conserved cysteine
residues within their eclosion hormone domain (Table 1, Fig. 11).

Two transcripts were predicted to encode follistatin-like
peptides. Although not considered neuropeptides, these were
included here as it might be of interest to further pursue their
precise functionality in crustaceans. The N-termini of both
predicted isoforms include identical signal peptides, followed by
identical follistatin domains, followed by a Kazal-type serine
protease inhibitor domain whose N-terminus is identical and whose
C-terminus differs between the isoforms (Table 1, Fig. 12). One
isoform includes a predicted transmembrane region and is a complete
ORF (Fig. 12A), while the other is longer, lacks a predicted
transmembrane region and is a partial ORF (Fig. 12B). One
transcript was identified to encode a complete myostatin
precursor with exactly the same sequence as that identified in the
penaeid shrimp P. monodon (Table 1, Fig. 13). Although, like
follistatin, it is not considered a neuropeptide, its function in
regulating muscle development in crustaceans is an interesting aspect to

1074 atgcgctcagtgatgctaggagccatggtcctgctggccgcctgc
MRSVMLGAMVLLAAC
1029 tggtcccccgcagcaggctggggctatatcttcagcaagttccgg
WSPAAGWGYIFSKFR
984 ccagaagcaggacccaactggggctacgggagcgtagggcagcac
PEAGPNWGYGSVGQH
939 taccagggacccatgggcgagcggatgctgtcgccccaggagcag
YQGPMGERMLSPQEQ
894 ctgatggaggccctgatggggggagaggaggtgctggaggaacag
LMEALMGGEEVLEEQ
849 ctgtgcgaggggcgccgctgcacggccaacgaacagtgttgcagc
LCEGRRCTANEQCCS
804 ggtcacgtctgtgtcgagttcgatggagcctcagggacgtgcatg
GHVCVEFDGASGTCM
759 ggccagcgtgaaggagctgactgccgcggggactccgagtgcgct
GQREGADCRGDSECA
714 gatggacttctttgtcacctgggcgcctgcgtccagtaccaggga
DGLLCHLGACVQYQG
669 aagaaacgctacaatgagcagtgtgacgtgagctccgagtgcgac
KKRYNEQCDVSSECD
624 gttggacgcggcctctgttgccaggtcatccgacgtcatcgccag
VGRGLCCQVIRRHRQ
579 gcgccaaagacggtgtgtggctacttcaaggacccaatgatctgc
APKTVCGYFKDPMIC
534 atcggacacgtagctacggaccaggtaaagacagaaggaggcaag
IGHVATDQVKTEGGK
489 cagtaa 484
Q*

Figure 19. Prohormone-3 precursor predicted complete ORF. A complete prohormone-3 peptide precursor (derived from
CL1958.Contig1_All) with a signal peptide (red) and 12 cysteine residues (yellow). Asterisk indicates the stop codon.
doi:10.1371/journal.pone.0097323.g019



328 atgtgcatttccatccagtacctgtgtgacggagccccagattgc
MCISIQYLCDGAPDC
373 cctgacggatacgacgagaacccacgcctctgcacggcagccaag
PDGYDENPRLCTAAK
418 cgtcccccagtagaggagacggcgtccttcctgcagtccctgctg
RPPVEETASFLQSLL
463 gcatcccacggccccaactaccttgagaagctcttcggcagcaag
ASHGPNYLEKLFGSK
508 gcccgcaatgccctcaaggccctgggaggtgtggagcaggttgct
ARNALKALGGVEQVA
553 gtcgctctctcagagtcacagaccatcgacgaattcggtgactcc
VALSESQTIDEFGDS
598 ctgcgtttgttgaggtccgacgtggagcacctgcgttcggtcttc
LRLLRSDVEHLRSVF
643 atggctgtggagaacggagacatcggcatgctcaagtctctcggc
MAVENGDIGMLKSLG
688 atcaaggactccgagctgggtgatgtcaagttcttcctggaaaag
IKDSELGDVKFFLEK
733 cttgtcaacactggattcctcgactga 759
LVNTGFLD*

Figure 20. Prohormone-4 precursor predicted partial ORF. A
partial prohormone-4 peptide precursor (derived from
Unigene19311_All). Asterisk indicates the stop codon.
doi:10.1371/journal.pone.0097323.g020



107 atggttcgtgccggcgtcgcccttcttctggtagtgttggtggtg
MVRAGVALLLVVLVV
152 gccgccagcgtctcagcccagctcaacttctcaccgggttggggc
AASVSAQLNFSPGWG
197 aagcgggctgcggcggcggccgccggcggcaccgaccctgccgca
KRAAAAAAGGTDPAA
242 gccgccctccgctccccagcagtcctggccgtggggccttcctct
AALRSPAVLAVGPSS
287 cctgccgtcggggacacctgcggcgccatccccgtctccaccgtc
PAVGDTCGAIPVSTV
332 atgcacatctacaggctcatcaggagcgaggcggcgcggcttgcc
MHIYRLIRSEAARLA
377 cagtgtcaggacgaggagtacctgggctag 406
QCQDEEYLG*

Figure 21. RPCH precursor predicted complete ORF. A complete
RPCH peptide precursor (derived from Unigene2547_All) starting with a
signal peptide (red) followed by a RPCH domain (green) with an
amidated glycine (blue). Asterisk indicates the stop codon.
doi:10.1371/journal.pone.0097323.g021

61 atgaggtacacagctaggagcacggcggtgttggtgacggtggcc
MRYTARSTAVLVTVA
106 gccatcctactgccgtgtgtcgccccggctccggccagaccctcc
AILLPCVAPAPARPS
151 ctagcacgagctttggttcccgtcgtcagacacagactccaggag
LARALVPVVRHRLQE
196 ggtcgcctgccccccgcactggtagaggagctggtgtcggacttc
GRLPPALVEELVSDF
241 gaagatccggagctcatggacttccatgatgcggccggcaagaga
EDPELMDFHDAAGKR
286 gagttcgacgagtacggccacatgaggttcggcaagcggagcggg
EFDEYGHMRFGKRSG
331 ggcgaatacgacgactatggccacttgcggtttggcaggagcctg
GEYDDYGHLRFGRSL
376 aaccacaaccaccacgactcttcacttcactaa 408
NHNHHDSSLH*

Figure 22. Sulfakinin precursor predicted complete ORF. A
complete sulfakinin peptide precursor (derived from Unigene25008_All)
starting with a signal peptide (red) followed by two sulfakinin putative
peptides (green) with an amidated glycine (blue), separated by putative
carboxy-peptidase cleavage sites (underlined). Asterisk indicates the
stop codon.
doi:10.1371/journal.pone.0097323.g022
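
The amino acid line beneath each codon row in the ORF figures follows from reading the nucleotide sequence in frame with the standard genetic code. The following is a minimal Python sketch of that step, using the final codon row of the sulfakinin ORF in Figure 22 as input. The function name `translate` is illustrative and is not part of the pipeline used in this study.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate an in-frame nucleotide string; '*' marks a stop codon."""
    orf = orf.lower().replace(" ", "")
    if len(orf) % 3:
        raise ValueError("ORF length is not a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


# Final codon row of the sulfakinin ORF (positions 376-408 in Figure 22).
last_row = "aaccacaaccaccacgactcttcacttcactaa"
print(translate(last_row))  # NHNHHDSSLH*
```

The same function can be applied to any other codon row to check a transcribed translation against its nucleotide line.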

pursue and is thus included here. Recently, an opposite role was
assigned to myostatin in P. monodon compared with vertebrates
[29]. Based on the identical sequence identified in this study, the
Eastern rock lobster might serve as a good candidate species to revisit
this hypothesis. A complete myosuppressin precursor was
predicted with a signal peptide and high similarity to H.
americanus myosuppressin (Table 1, Fig. 14).

One complete predicted neuropeptide Y (NPY) precursor was
identified with a conserved active peptide sequence (Table 1,
Fig. 15), and two predicted complete neuroparsin peptide
precursors were identified with 12 conserved cysteine residues in
each, but with only intermediate similarity between them
(Table 1, Fig. 16). Another predicted neuropeptide, orcokinin,
was identified that included a highly conserved motif of
NFDEIRDRXGFGFX within its 11 predicted mature peptides
(Table 1, Fig. 17). Two isoforms of the pigment dispersing
hormone (PDH) precursor were identified with intermediate
similarity overall (Fig. 18). The predicted mature peptide shows high
similarity between the two sequences (15/18 aa identical). Two
sequences were predicted to encode complete prohormone-3
and prohormone-4 precursors (Table 1, Figs. 19, 20). Both have
been characterized solely in insects, apart from one
prohormone-4-like peptide identified in the copepod Acartia pacifica
(GenBank accession number AGN29584); hence, this is the first report
of the two hormones in decapods.
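
The 15/18 identity reported for the two mature PDH peptides can be checked by a position-wise comparison. The sketch below is illustrative only. It assumes that each mature peptide spans the 18 residues between the KR cleavage site and the amidated glycine, as read from the two reading frames in Figure 18, and the helper name `count_identical` is hypothetical.

```python
def count_identical(a: str, b: str) -> int:
    """Count positions carrying the same residue in two equal-length peptides."""
    if len(a) != len(b):
        raise ValueError("an ungapped comparison needs peptides of equal length")
    return sum(x == y for x, y in zip(a, b))


# Predicted mature PDH peptides as read from Figure 18A and 18B (assumption:
# residues between the KR site and the C-terminal amidated glycine).
pdh_a = "NAELINSILGLPKVMNDA"
pdh_b = "NAELINSLLGIPKVMSDA"

identical = count_identical(pdh_a, pdh_b)
print(f"{identical}/{len(pdh_a)} identical ({100 * identical / len(pdh_a):.1f}%)")
# 15/18 identical (83.3%)
```

The three mismatches (I/L, L/I and N/S) are substitutions between residues of similar character.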

A predicted red pigment concentrating hormone (RPCH) 
precursor was identified with a signal peptide and RPCH domain 
(Table 1, Fig. 21). Another sequence is predicted to encode a 
complete sulfakinin precursor with a signal peptide and two 
mature peptides separated by peptidase cleavage sites (Table 1, 
Fig. 22). Finally, one sequence was identified that putatively encodes
a complete tachykinin precursor with a signal peptide followed
by seven identical tachykinin peptides, separated by peptidase
cleavage sites (Table 1, Fig. 23). The putative tachykinin sequence
had high similarity to the one identified in the spiny lobster P.
interruptus.
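
The way repeated mature peptides are read off a precursor can be sketched in a few lines of Python. This is a simplified illustration rather than the prediction method used in the study. It assumes that any pair of basic residues (K or R) is a cleavage site and that a glycine immediately preceding such a site is converted to a C-terminal amide. The input is a fragment of the predicted tachykinin precursor read from Figure 23, and the function name is hypothetical.

```python
import re

# Simplified rule (assumption): any pair of basic residues (K/R) is a cleavage site.
DIBASIC_SITE = re.compile(r"[KR]{2}")


def predict_amidated_peptides(precursor: str) -> list[str]:
    """Return peptides flanked by dibasic sites on both sides whose C-terminal
    Gly would be converted to an amide; the Gly is dropped from the peptide."""
    peptides = []
    previous_site_end = None  # stays None until the first cleavage site is seen
    for site in DIBASIC_SITE.finditer(precursor):
        if previous_site_end is not None:
            segment = precursor[previous_site_end:site.start()]
            if len(segment) > 1 and segment.endswith("G"):
                peptides.append(segment[:-1] + "-NH2")
        previous_site_end = site.end()
    return peptides


# Fragment of the predicted tachykinin precursor (Figure 23, positions 667-801).
fragment = "LKRAPSGFLGMRGKKAPSGFLGMRGKKAPSGFLGMRGKKHYDDDG"
print(predict_amidated_peptides(fragment))
```

On this fragment the sketch yields three copies of the same amidated peptide, matching the repeated tachykinin unit described above. Monobasic sites and other processing signals are deliberately ignored.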

Diuretic hormone, eclosion hormone, orcokinin, pigment 
dispersing hormone, prohormone-3, prohormone-4 and sulfakinin 
all show higher expression levels in males, while CHH and MIH 
show higher expression levels in females (Table 3). Further analysis 
at precise molt stages is required to determine whether these neuropeptides
have a role only in molt regulation or also modulate gender-



307 atgtcttggactggtgcaaggacagtgctggtggtgctcgcccta
MSWTGARTVLVVLAL
352 gcagcgtgtgtcagccaagcccaggacgccagcgaccgggaacga
AACVSQAQDASDRER
397 cgggcgccctccggcttcttgggcatgcggggcaagaaggacgcc
RAPSGFLGMRGKKDA
442 gcggcgcccctgaacgacgtggacgacgccgccagcgactacccc
AAPLNDVDDAASDYP
487 gtcctgcccgaccccatcgctgctagactgtacgccttcaggaac
VLPDPIAARLYAFRN
532 ggcaacgctcccgtgggtctcgccatgcccttgagaggcaaaaag
GNAPVGLAMPLRGKK
577 gcaccctctggattccttgggatgcgaggcaagaagagtgatgag
APSGFLGMRGKKSDE
622 gaaatctttggtgaggccagcgacgacaatgacttggagactctg
EIFGEASDDNDLETL
667 cttaagcgtgccccttcaggcttcctgggtatgcgcggcaagaaa
LKRAPSGFLGMRGKK
712 gctccctcagggttcctgggaatgcggggtaagaaggcaccctct
APSGFLGMRGKKAPS
757 ggtttccttggcatgagaggcaagaaacactatgacgacgatggt
GFLGMRGKKHYDDDG
802 gagatggacgccttcatccaggcattgacaacgatgatggacggg
EMDAFIQALTTMMDG
847 cagcaacagaaacgagctccctctggatttttgggaatgcgtggt
QQQKRAPSGFLGMRG
892 aaaaaggccatttatggtgatgacacagacgaagagcttaacatg
KKAIYGDDTDEELNM
937 gcaggtgtggacaagagagcaccttcaggttttcttggtatgagg
AGVDKRAPSGFLGMR
982 ggctga 987
G*

Figure 23. Tachykinin precursor predicted complete ORF. A
complete tachykinin peptide precursor (derived from
CL7656.Contig2_All) starting with a signal peptide (red) followed by seven identical
tachykinin putative peptides (green) with an amidated glycine (blue),
separated by putative carboxy-peptidase cleavage sites (underlined).
Asterisk indicates the stop codon.
doi:10.1371/journal.pone.0097323.g023

derived differences. This study has laid the foundations that will
enable us to pursue this biological question.

Conclusions 

This study describes a comprehensive transcriptome of the
central nervous system of S. verreauxi, whose mining led to the
identification of its putative neuropeptidome. Most of the
identified neuropeptides had high similarity to previously
identified neuropeptides, primarily those of other closely related
decapod crustaceans. Approximately 21 families and subfamilies
were covered, including neurohormones previously identified in
other crustacean species as well as two (prohormone-3 and
prohormone-4) that were previously reported primarily in insects;
this is the first report of their identification in decapod
crustaceans. Mapping and quantification give insights into the
dynamics of neuropeptide expression during the molt cycle and
with regard to gender.

Materials and Methods 

Animals 

Sagmariasus verreauxi individuals were maintained at the Institute for
Marine and Antarctic Studies under previously described parameters
[30]. Prior to dissection, animals were anesthetized on ice
for at least 20 min.

Sample Preparation and Sequencing

Total RNA from eyestalks and brains of two mature S. verreauxi
males and two mature females was isolated separately with the
Trizol Reagent (Invitrogen), according to the manufacturer's
instructions, followed by next generation sequencing by BGI
(Hong Kong Co. Ltd) as per the manufacturer's protocol (Illumina,
San Diego, CA). Briefly, poly(A) mRNA was isolated using oligo(dT)
beads, and the addition of fragmentation buffer for shearing
mRNA into short fragments (200-700 nt) prevented priming bias
during the synthesis of cDNA using random hexamer primers.
The short fragments were further purified using a QIAquick PCR
extraction kit and resolved with EB buffer for ligation with
Illumina paired-end adapters. This was followed by size selection
(~200 bp), PCR amplification and Illumina sequencing using an
Illumina Genome Analyzer (HiSeq 2000, Illumina, San Diego,
CA), performing 90 bp paired-end sequencing. The sequence
reads were stored as FASTQ files. Overall, at least 4 Gb of cleaned
data (at least 45 million reads) were generated for each of the four
samples sequenced: pooled eyestalks of two males, pooled eyestalks of
two females, pooled brains of two males and pooled brains of two females.
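
As a rough consistency check on the sequencing yield, 45 million reads of 90 bp correspond to about 4 Gb. The short Python sketch below makes the arithmetic explicit. The helper name is hypothetical, and every read is assumed to retain its full nominal length after cleaning.

```python
def total_bases(n_reads: int, read_length_bp: int) -> int:
    """Total sequenced bases, assuming every read has the full nominal length."""
    return n_reads * read_length_bp


bases = total_bases(45_000_000, 90)
print(f"{bases / 1e9:.2f} Gb")  # 4.05 Gb
```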

Bioinformatics analyses

Cleaning of low quality reads, assembly and annotation were
done by BGI, using unpublished algorithms (BGI, Hong Kong Co.
Ltd), Trinity [31] and Blast2GO [32], respectively. We validated
that the reads obtained by BGI were clean using FASTQ/A
Trimmer (http://hannonlab.cshl.edu/fastx_toolkit/index.html),
which left over 99.99% of the reads untrimmed.
The list of annotated sequences was scanned for key words,
including names and abbreviations of previously known
neurohormones as well as general key words such as 'hormone'.
Multiple sequence alignment of the predicted neuropeptide
sequences was performed with ClustalW [33], followed by a
Neighbor Joining Phylogram (for the CHH sequences) generated
via MEGA 5.0 [34] with 1000 bootstrap trials. The multiple
sequence alignment file was then exported to TexShade [35] for
highlighting the conserved sequence motifs. Signal peptides were
predicted using the SignalP 4.1 server [36]. Domain prediction was
done either via SMART [37] or by comparison with reference
neuropeptide sequences from other crustaceans. The re-validated clean
FASTQ files were re-assembled using default parameters in CLC
Genomics Workbench v4 (CLC Bio), and the assembled
transcripts corresponding to the neuropeptides were validated using
BLAST. Digital Gene Expression was computed using CLC Genomics
Workbench v4 (CLC Bio), with default parameters with the exception
of a similarity fraction of 0.9 instead of 0.8. The resulting BAM files were
deposited in the Sequence Read Archive (http://www.ncbi.nlm.nih.gov/sra)
as biosample SAMN02419461. BAM files were then
uploaded onto Partek Genomics Suite (Partek GS) where
quantification was performed, yielding reads per kilobase per
million reads (RPKMs). The quantified data were analyzed using
ANOVA, performed in Partek GS, with contrast between values
in eyestalk and brain for each neuropeptide. The threshold for
statistical significance was set to p<0.05. Since there was only one
male and one female sample for each tissue, no statistical analysis
was applicable to compare males and females.
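
RPKM normalizes the read count of a transcript by transcript length and by library size. The sketch below shows the standard formula only. It is not Partek GS code, the function name is illustrative, and the counts in the usage line are invented for demonstration.

```python
def rpkm(mapped_reads: int, transcript_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("transcript length and library size must be positive")
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_reads / 1_000_000
    return mapped_reads / (kilobases * millions)


# Hypothetical numbers: 900 reads on a 1,500 bp transcript in a library of
# 45 million mapped reads.
print(rpkm(900, 1_500, 45_000_000))
```

Because both length and library size are divided out, values are comparable between transcripts of different lengths and between the eyestalk and brain libraries.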
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